Discussion on May 5, 2010 (Wednesday)
The AML team, comprising Dr. Sajjad, Saleha, Osama, Asma and Farah, held its first meeting, in which Osama and Farah were briefed about the project.
Discussion on April 24, 2010 (Saturday)
A meeting was arranged to discuss AML issues with Mr. Nauman Sheikh and two of the CreditChux team members. The following issues have been discussed so far:
A Basic Banking Account is opened by the bank mostly for two purposes. One is when a loan is approved: the bank opens a deposit account and links it to the credit account. The other is for employees who would avail minimal facilities (no ATM, etc.). At the other end, a Value account is for high-end customers.
There are only three HBL DPA accounts and their credit amounts are very large; should we keep these outliers or not? At this moment we will not remove them; we will first take input from a bank official and then decide.
We can discard locker accounts.
Discard all other account types and take only the top three into consideration: Basic Banking, PLS Saving and Current accounts.
Remove all those whose average amount is less than 1,000.
Remove all those whose transaction count is less than 100.
Blank ages have not been provided
An interesting pattern is observed for ages between 48 and 55: an average of 10 transactions.
An interesting pattern is observed in Karachi North: the average number of transactions is much higher than in other regions.
Bank Draft Proceeds is a credit and Bank Draft Issued is a debit.
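The cleaning decisions above (keep only the top three account types, drop low average amounts, drop low transaction counts) could be sketched roughly as follows. This is only an illustration: the field names (account_type, avg_amount, txn_count), the account-type labels, and the reading that both thresholds apply per account are assumptions, not the bank's actual schema.

```python
# Hypothetical labels and field names; the real codes in the bank data may differ.
KEEP_TYPES = {"Basic Banking", "PLS Saving", "Current"}
MIN_AVG_AMOUNT = 1000  # threshold taken from the notes
MIN_TXN_COUNT = 100    # threshold taken from the notes


def filter_accounts(accounts):
    """Return only the accounts that pass the cleaning rules from the meeting."""
    kept = []
    for acc in accounts:
        if acc["account_type"] not in KEEP_TYPES:
            continue  # drops lockers and every other account type
        if acc["avg_amount"] < MIN_AVG_AMOUNT:
            continue  # average amount too small
        if acc["txn_count"] < MIN_TXN_COUNT:
            continue  # too few transactions
        kept.append(acc)
    return kept


# Made-up sample records, for illustration only.
sample = [
    {"id": 1, "account_type": "Current", "avg_amount": 5200.0, "txn_count": 340},
    {"id": 2, "account_type": "Locker", "avg_amount": 9000.0, "txn_count": 500},
    {"id": 3, "account_type": "PLS Saving", "avg_amount": 800.0, "txn_count": 150},
    {"id": 4, "account_type": "Basic Banking", "avg_amount": 1500.0, "txn_count": 40},
]
cleaned = filter_accounts(sample)
```

The HBL DPA outliers are deliberately not handled here, since the decision on them is pending input from a bank official.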
Discussion on April 21, 2010 (Wednesday)
The following steps will be performed:
Clean the data by removing the irrelevant account types and transaction codes and then recompute the previously agreed variables.
Run clustering algorithms on the cleaned (and selected training) data set and store the cluster specifications (average value, lower bound and upper bound of each attribute). To begin with, we can focus only on K-Means.
Assign each customer to a cluster.
Process the test data sequentially and compare each record against its corresponding cluster and if the record is considered an outlier, raise a flag.
The flagged records should be a small percentage (2-5%) of the test data. Adjust the clusters' specifications and outlier metrics accordingly to meet this percentage.
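A minimal sketch of the steps above, using only the Python standard library, might look like this. Several things here are assumptions made for illustration: the outlier rule (a record is flagged when any attribute falls outside its cluster's bounds widened by a margin), assigning test records to the nearest centroid, the naive first-k initialisation, and all function names. The notes do not fix any of these.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means; returns (centroids, labels)."""
    # Naive init from the first k points, chosen only to keep the sketch
    # deterministic; a real run would use random restarts or k-means++.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(col) for col in zip(*members)]
    return centroids, labels


def cluster_specs(points, labels, k):
    """Per cluster: average, lower bound and upper bound of each attribute."""
    specs = {}
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            cols = list(zip(*members))
            specs[c] = {"avg": [sum(col) / len(col) for col in cols],
                        "low": [min(col) for col in cols],
                        "high": [max(col) for col in cols]}
    return specs


def nearest_cluster(record, centroids, specs):
    """Nearest non-empty cluster (assumed assignment rule for test records)."""
    return min(specs, key=lambda c: sq_dist(record, centroids[c]))


def is_outlier(record, spec, margin=0.0):
    """Flag a record if any attribute lies outside the widened cluster bounds."""
    for x, lo, hi in zip(record, spec["low"], spec["high"]):
        slack = margin * (hi - lo)
        if x < lo - slack or x > hi + slack:
            return True
    return False


def tune_margin(records, centroids, specs, max_rate=0.05, step=0.1, max_steps=100):
    """Widen the bounds until at most max_rate of the records are flagged."""
    margin = 0.0
    for _ in range(max_steps):
        flagged = sum(is_outlier(r, specs[nearest_cluster(r, centroids, specs)], margin)
                      for r in records)
        if flagged / len(records) <= max_rate:
            break
        margin += step
    return margin


# Made-up two-attribute training and test data, for illustration only.
train = [[1, 1], [1, 2], [2, 1], [2, 2], [10, 10], [10, 11], [11, 10], [11, 11]]
centroids, labels = kmeans(train, k=2)
specs = cluster_specs(train, labels, k=2)

test_records = [[1.5, 1.5], [5, 5], [10.4, 10.6]]
flags = [is_outlier(r, specs[nearest_cluster(r, centroids, specs)], margin=0.5)
         for r in test_records]
```

Here tune_margin only caps the flagged share from above (5% by default); keeping it above 2% as well would need a similar check in the other direction.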
Discussion on April 7, 2010 (Wednesday)
It was decided that in the beginning we would focus on the following variables:
Average Monthly Withdrawal, Average Monthly Deposit, Average Number of Withdrawal Transactions, Average Number of Deposit Transactions, Average Start of the Month Withdrawal, Average Start of the Month Deposit, Average End of the Month Withdrawal, Average End of the Month Deposit
Branch Code, Birth Year, Account Type, Emp_Ind, Gender, NTC_NBR_GVN_IND, Max Year to Date Balance
The continuous attributes would be discretized first and then K-Means would be applied to form K clusters.
The initial benchmark is the SARs (Suspicious Activity Reports) generated by the bank. Our first goal is to replicate (or even improve) the SARs generation process.
In the next step, we would focus on those aggregate variables that capture the sequence aspect too. For instance,
Average Delay in Two Consecutive Withdrawals, Average Delay in Two Consecutive Deposits, Average Ratio between Two Consecutive Deposits, Average Ratio between Two Consecutive Withdrawals, etc.
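As a rough illustration of these sequence-aware aggregates, the sketch below computes the average delay and the average ratio for one customer's withdrawals. The function names, the use of days as the delay unit, and the direction of the ratio (later amount divided by earlier amount) are assumptions; the notes do not specify them.

```python
from datetime import date


def avg_delay_days(dates):
    """Average gap in days between consecutive transactions (None if fewer than two)."""
    ordered = sorted(dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) if gaps else None


def avg_consecutive_ratio(amounts):
    """Average of amounts[i+1] / amounts[i]; amounts must already be in time order."""
    ratios = [b / a for a, b in zip(amounts, amounts[1:]) if a != 0]
    return sum(ratios) / len(ratios) if ratios else None


# Made-up withdrawals for one customer: (transaction date, amount).
withdrawals = [
    (date(2010, 1, 15), 1000.0),
    (date(2010, 1, 5), 2000.0),
    (date(2010, 2, 4), 4000.0),
]
withdrawals.sort()  # order by date before computing the sequence features

delay = avg_delay_days([d for d, _ in withdrawals])
ratio = avg_consecutive_ratio([amt for _, amt in withdrawals])
```

The same two functions would apply unchanged to a customer's deposits.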